Visualizing bivariate long-tailed data
Similar resources
Visualizing bivariate long-tailed data
Variables in large data sets in biology or e-commerce often have a head, made up of very frequent values, and a long tail of ever rarer values. Models such as the Zipf or Zipf–Mandelbrot provide a good description. The problem we address here is the visualization of two such long-tailed variables, as one might see in a bivariate Zipf context. We introduce a copula plot to display the joint behav...
Displaying Bivariate Data
Numerical techniques are too often designed to yield specific answers to rigidly defined questions. Graphical techniques are less confining. They aid in understanding the numerous relationships reflected in the data. They help reveal the existence of peculiar looking observations or subsets of the data. It is difficult to obtain similar information from numerical procedures. In this article, by...
Efficient fitting of long-tailed data sets into PH distributions
We propose a new technique for fitting long-tailed data sets into phase-type (PH) distributions. This technique fits data sets with non-monotone densities into a mixture of Erlang and hyperexponential distributions, and data sets with completely monotone densities into hyperexponential distributions. The method partitions the data set in a divide-and-conquer fashion and uses the Expectation-Maxim...
Efficient fitting of long-tailed data sets into hyperexponential distributions
We propose a new technique for fitting long-tailed data sets into hyperexponential distributions. The approach partitions the data set in a divide-and-conquer fashion and uses the Expectation-Maximization (EM) algorithm to fit the data of each partition into a hyperexponential distribution. The fitting results of all partitions are combined to generate the fitting for the entire data set. The n...
Recognizing and visualizing departures from independence in bivariate data using local Gaussian correlation
It is well known that the traditional Pearson correlation in many cases fails to capture non-linear dependence structures in bivariate data. Other scalar measures capable of capturing non-linear dependence exist. A common disadvantage of such measures, however, is that they cannot distinguish between negative and positive dependence, and typically the alternative hypothesis of the accompanying ...
Journal
Journal title: Electronic Journal of Statistics
Year: 2011
ISSN: 1935-7524
DOI: 10.1214/11-ejs622